https://upload.wikimedia.org/wikipedia/commons/7/70/Aminoacids_table.svg
(determined by BLAST sequence comparison with high sequence similarity to a biochemically
verified protein) should then also have a nuclear localization signal (determined, for
example, by the ELM server, the “eukaryotic linear motif server”), and the domain
composition (determined by the SMART database; Letunic et al. 2015) should confirm the
transcription factor by revealing a DNA-binding domain. After all, everything has to match
because we always started from the same sequence. In turn, the different bioinformatics
algorithms check and correct each other. In a living cell, the domains in the protein have
to fit together correctly.
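The cross-check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how BLAST, ELM or SMART work internally: the class Evidence, the function check_transcription_factor, the 40% identity threshold and the domain labels are all hypothetical, and the regular expression is a deliberately simplified stand-in for a classical nuclear localization signal (K-(K/R)-X-(K/R)) rather than an ELM motif definition. In practice the three inputs would come from the respective tools.

```python
import re
from dataclasses import dataclass, field

# Assumption: simplified classical monopartite NLS consensus K-(K/R)-X-(K/R).
# Real motif servers such as ELM use more refined patterns and context filters.
NLS_PATTERN = re.compile(r"K[KR].[KR]")


@dataclass
class Evidence:
    """Hypothetical container for results gathered on ONE protein sequence."""
    sequence: str                # amino acid sequence, one-letter code
    blast_identity: float        # percent identity to a verified protein
    domains: list = field(default_factory=list)  # domain labels from a domain search


def check_transcription_factor(ev, min_identity=40.0,
                               dna_binding_domains=("HOX", "HLH", "BRLZ")):
    """Report whether three independent lines of evidence agree.

    min_identity and dna_binding_domains are illustrative placeholders,
    not recommended values.
    """
    checks = {
        "similar_to_verified_protein": ev.blast_identity >= min_identity,
        "nuclear_localization_signal": NLS_PATTERN.search(ev.sequence) is not None,
        "dna_binding_domain": any(d in dna_binding_domains for d in ev.domains),
    }
    # The annotation is only considered consistent if every check passes.
    checks["consistent"] = all(checks.values())
    return checks


# Made-up example sequence containing the motif PKKKRKV.
candidate = Evidence(sequence="MSTPKKKRKVEDLQ", blast_identity=85.0, domains=["HOX"])
for name, passed in check_transcription_factor(candidate).items():
    print(f"{name}: {passed}")
```

If one of the checks fails, for example because no DNA-binding domain is found, this is exactly the kind of disagreement between methods that should prompt a closer look at the sequence or at the individual predictions.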
Learning to better understand this genetic “language of life” was, at least for me, a
major reason to learn bioinformatics – and the computer is only one, albeit very powerful,
tool for this.
Another way to approach this aspect of the language of life is through the proteins
themselves. Their richness can be viewed directly with the Pfam database (all protein
families; pfam.xfam.org) or UniProt (database of all known proteins and protein sequences;
https://www.uniprot.org). This makes it much easier to understand the huge number of different
12 Life Continuously Acquires New Information in Dialogue with the Environment